Deterministic dependency parsing of unrestricted English text (Master’s Thesis)
Abstract
This master’s thesis describes a deterministic dependency parser that uses memory-based learning to parse unrestricted English text. A converter transforms the Wall Street Journal section of the Penn Treebank into an intermediate dependency representation, which is then used to train the parser with the TiMBL library (Daelemans, Zavrel, Sloot, & Bosch, 2003). The parser outputs labeled dependency graphs whose arc labels combine bracket labels and grammatical role labels from the Penn Treebank II annotation scheme (Marcus, Kim, et al., 1994). It reaches a maximum unlabeled attachment score of 87.1% and a labeled accuracy of 86.0%, where both the head and the arc label must be correct. These results are close to the state of the art in dependency parsing, and the parser also outputs arc labels that other parsers do not produce.
Similar resources
Transition-Based Natural Language Parsing with Dependency and Constituency Representations
Hall, Johan, 2008. Transition-Based Natural Language Parsing with Dependency and Constituency Representations, Acta Wexionensia No 152/2008. ISSN: 1404-4307, ISBN: 978-91-7636-625-7. Written in English. This thesis investigates different aspects of transition-based syntactic parsing of natural language text, where we view syntactic parsing as the process of mapping sentences in unrestricted tex...
Incrementality In Deterministic Dependency Parsing
Deterministic dependency parsing is a robust and efficient approach to syntactic parsing of unrestricted natural language text. In this paper, we analyze its potential for incremental processing and conclude that strict incrementality is not achievable within this framework. However, we also show that it is possible to minimize the number of structures that require nonincremental processing by ...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
Deterministic Dependency Parsing of English Text
This paper presents a deterministic dependency parser based on memory-based learning, which parses English text in linear time. When trained and evaluated on the Wall Street Journal section of the Penn Treebank, the parser achieves a maximum attachment score of 87.1%. Unlike most previous systems, the parser produces labeled dependency graphs, using as arc labels a combination of bracket labels...
Fast, Deep-Linguistic Statistical Dependency Parsing
We present and evaluate an implemented statistical minimal parsing strategy exploiting DG characteristics to permit fast, robust, deep-linguistic analysis of unrestricted text, and compare its probability model to (Collins, 1999) and an adaptation, (Dubey and Keller, 2003). We show that DG allows for the expression of the majority of English LDDs in a context-free way and offers simple yet powerfu...